A Novel Text Classification Approach Based on Enhanced Association Rule

Authors

  • Jiangtao Qiu
  • Changjie Tang
  • Tao Zeng
  • Shaojie Qiao
  • Jie Zuo
  • Peng Chen
  • Jun Zhu
Abstract

Current research on association-rule-based text classification has neglected several key problems. First, the weights of elements in profile vectors may strongly affect the generation of classification rules. Second, traditional association rules lack semantics; enriching the semantics of association rules may help improve classification accuracy. To address these problems, we propose a new classification approach that includes: (1) mining frequent item-sets on item-weighted transactions; and (2) generating enhanced association rules that have richer semantics than traditional association rules. Experiments show that the new approach outperforms the CMAR, S-EM, and NB algorithms in classification accuracy.
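As an illustration of step (1), the following is a minimal sketch of Apriori-style mining over item-weighted transactions. It is not the authors' algorithm: the abstract does not define how weights enter the support measure, so the sketch assumes that each document is a mapping from term to weight and that the weighted support of an item-set is the average, over all documents, of the smallest weight among its items (zero when an item is missing). The names `weighted_support` and `frequent_itemsets` are hypothetical.

```python
from itertools import combinations


def weighted_support(itemset, transactions):
    """Assumed measure: mean over all transactions of the minimum item weight.

    A transaction that lacks any item of `itemset` contributes 0.
    """
    total = 0.0
    for t in transactions:
        if all(item in t for item in itemset):
            total += min(t[item] for item in itemset)
    return total / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search for item-sets with enough weighted support.

    The min-based measure never grows when an item is added, so a candidate
    can be pruned as soon as one of its subsets is infrequent.
    """
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]
    result = {}
    k = 1
    while current:
        # Keep the candidates of size k that reach the threshold.
        survivors = {}
        for candidate in current:
            support = weighted_support(candidate, transactions)
            if support >= min_support:
                survivors[candidate] = support
        result.update(survivors)

        # Join surviving k-sets into (k+1)-set candidates and prune.
        k += 1
        candidates = set()
        for a, b in combinations(list(survivors), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in survivors for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        current = list(candidates)
    return result


if __name__ == "__main__":
    # Hypothetical profile vectors: term -> weight (e.g. normalized term frequency).
    docs = [
        {"goal": 0.8, "match": 0.6, "team": 0.4},
        {"goal": 0.5, "team": 0.5},
        {"match": 0.9, "vote": 0.7},
        {"goal": 0.6, "match": 0.3},
    ]
    for itemset, support in sorted(
        frequent_itemsets(docs, 0.2).items(), key=lambda kv: -kv[1]
    ):
        print(sorted(itemset), round(support, 3))
```

Taking the minimum weight keeps the measure anti-monotone, which is what makes the subset-based pruning valid; other weighting schemes (for example, summing or averaging item weights) would need a different pruning argument.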

Similar resources

A Novel Single-Label Supervised Text Classification Approach based on Mining Association Rules

In this paper, we introduce a novel single-label supervised text classification approach based on mining association rules, called Apriori-TFP-TC. We follow the common framework of text mining in general, separating text classification into two stages, (1) text preprocessing and (2) the utilization of a selected data mining classification technique. We describe two text pre-processing technique...

Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms

In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...

A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features

Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three kinds of methods have conventionally been used to extract named entities from a text: rule-based, machine-learning-based, and hybrids of the two. Machine-learning-based methods perform well in the Persian language if they are trained with good features. To get good performanc...

Oil Reservoirs Classification Using Fuzzy Clustering (RESEARCH NOTE)

Enhanced Oil Recovery (EOR) is a well-known method to increase oil production from oil reservoirs. Applying EOR to a new reservoir is a costly and time consuming process. Incorporating available knowledge of oil reservoirs in the EOR process eliminates these costs and saves operational time and work. This work presents a universal method to apply EOR to reservoirs based on the available data by...

A Framework for Optimal Attribute Evaluation and Selection in Hesitant Fuzzy Environment Based on Enhanced Ordered Weighted Entropy Approach for Medical Dataset

Background: In this paper, a generic hesitant fuzzy set (HFS) model for clustering various ECG beats according to the weights of attributes is proposed. A comprehensive review of electrocardiogram signal classification and segmentation methodologies indicates that algorithms able to handle the nonstationarity and uncertainty of the signals effectively should be used for ECG analysis. Ex...

Publication date: 2007